ABSTRACT 

The modeling of longitudinal and multilevel data 
using a latent variable framework is reviewed. Particular emphasis is 
placed on growth modeling. Examples are discussed where repeated 
observations are made on students sampled within classrooms and 
schools. The concept of a latent variable is a convenient way to 
represent statistical variation not only in conventional psychometric 
terms with respect to constructs measured with error, but also in the 
context of models with random coefficients and variance components. 
These features are explored. The random coefficient feature is shown 
to be a useful way to study change and growth over time, while the 
variance component feature is shown to correctly reflect common
cluster sampling procedures.

LATENT VARIABLE MODELING OF LONGITUDINAL 
AND MULTILEVEL DATA 

Bengt Muthen 

CRESST/University of California, Los Angeles 
Graduate School of Education & Information Studies 

Abstract 

An overview is given of modeling of longitudinal and multilevel 
data using a latent variable framework. Particular emphasis is 
placed on growth modeling. Examples are discussed where repeated 
observations are made on students sampled within classrooms and 
schools. 

1. Introduction 

The concept of a latent variable is a convenient way to represent statistical 
variation not only in conventional psychometric terms with respect to constructs 
measured with error, but also in the context of models with random coefficients 
and variance components. These features will be studied in this paper. The 
random coefficient feature is shown to present a useful way to study change and 
growth over time. The variance component feature is shown to correctly reflect 
common cluster sampling procedures. 

This paper gives an overview of some aspects of latent variable modeling in 
the context of growth and clustered data. Emphasis will be placed on the 
benefits that can be gained from multilevel as opposed to conventional modeling, 
which ignores the multilevel data structure. Data from large-scale educational 
surveys will be used to illustrate the points. 



1 Invited paper for the annual meeting of the American Sociological Association, Section on
Methodology, Showcase Session.

2 I thank Ginger Nelson Goff for expert assistance.






CRESST Final Deliverable 



The paper is organized as follows. Sections 2-6 will discuss theory and 
Sections 7 and 8 applications. To save space, the theory sections will by 
necessity be terse. Some results are given for easy reference and the reader is 
referred to previous papers for the modeling rationale and the derivations of 
estimators (see, e.g., Muthen, 1990, 1992, 1994a, 1994b). In Section 2, 
aggregated versus disaggregated modeling will be discussed. Section 3 discusses 
intraclass correlations and design effects in the context of a two-level latent 
variable model. In Section 4, a two-level latent variable model and its estimation 
for continuous-normal data will be presented as a basis for analyses. In Section 
5, it is shown how a three-level model can be applied to growth modeling and 
how it can be re-formulated as a two-level model. Section 6 shows how this 
modeling can be fit into the two-level latent variable framework. It is shown 
that the estimation can be carried out by conventional structural equation 
modeling software. The remaining sections present applications. Section 7 uses 
two-wave data on mathematics achievement for students sampled within 
classrooms. Section 7.1 discusses measurement error when data have both 
within- and between-group variation and gives an example of estimating 
reliability for multiple indicators of a latent variable. Section 7.2 uses the same 
example to discuss change over time in within- and between-group variation 
taking unreliability into account. Section 8 takes the discussion of change over 
time further using a four-wave data set on students sampled within schools. 
Here, a growth model is formulated for the relationships between socio-economic 
status, attitude towards math, and mathematics achievement. Issues related to 
the assessment of stability and cross-lagged effects are also discussed. 

2. Aggregated Versus Disaggregated Modeling 

Consider the following two-level data structure. Let $y_{gi}$ denote a
p-dimensional vector for randomly sampled groups $g$ and randomly sampled
individuals $i$ within each such group, and decompose $y_{gi}$ into between- and
within-group variation,

$$y_{gi} = y_{Bg} + y_{Wgi}, \qquad (1)$$

and consider the decomposition of the corresponding (total) covariance matrix
into a within- and a between-group part,

$$\Sigma_T = \Sigma_W + \Sigma_B. \qquad (2)$$
In a typical educational example, $\Sigma_W$ refers to student-level variation and
$\Sigma_B$ refers to class-level or school-level variation. It is assumed that parameters
of the covariance matrices capture the essential aspect of the data. In line with
Muthen and Satorra (1993) (see also Skinner, Holt, & Smith, 1989) we will use
the term "aggregated modeling" when the usual sample covariance matrix $S_T$ is
analyzed with respect to parameters of $\Sigma_T$ and "disaggregated modeling" when
the analysis refers to parameters of $\Sigma_W$ and $\Sigma_B$. In our terms, a multilevel model
is a disaggregated model for multilevel data. Such data can, however, also be
analyzed by an aggregated model, that is, a model for the total covariance matrix
$\Sigma_T$. In terms of estimating parameters and drawing inferences, multilevel
data present the usual complications of correlated observations due to cluster
sampling. Special procedures are needed to properly compute standard errors of
estimates and chi-square tests of model fit. Effects of ignoring the multilevel
structure and using conventional procedures for simple random sampling are
illustrated in the next section in the context of a latent variable model. The
model is, however, that of a conventional analysis in that the usual set of latent
variable parameters is involved. In a disaggregated (or multilevel) model the
aspiration level is higher in that the parameters themselves change from those
of the conventional analysis. A much richer model with both within and between
parameters is used to describe both individual- and group-level phenomena. 

A theme in our discussion is the comparison of $\Sigma_T$ analysis and $\Sigma_W$ analysis
with respect to the magnitude of estimates. This comparison has a strong
practical flavor because if the differences are small, the multilevel aspects of the
data can be ignored apart from perhaps small corrections of standard errors and
chi-square. This is frequently the case. Even in such cases, however, there may
be information in the data that can be described in interesting ways by
parameters of $\Sigma_B$. In other words, the most frequent shortcoming when ignoring
the multilevel structure of the data is not what is misestimated but what is not
learned. 

3. Design Effects 

Drawing on Muthen and Satorra (1993), this section gives a brief overview 
of effects of the cluster sampling in multilevel data on the standard errors and 
test of model fit used in conventional covariance structure analysis assuming 
simple random sampling. 
Consider the well-known design effect (deff) formula for the variance
estimate of a mean with cluster size $c$ and intraclass correlation $\rho$,

$$V_C / V_{SRS} = 1 + (c - 1)\rho, \qquad (3)$$

where $V_C$ is the (true) variance of the estimator under cluster sampling and $V_{SRS}$
is the corresponding (incorrect) variance assuming simple random sampling
(Cochran, 1977). The intraclass correlation (icc) is defined as the amount of
between-group variation divided by the total amount of variation (between plus
within). This formula points out that the common underestimation of standard
errors when incorrectly assuming SRS is due to the combined effects of group
size ($c$) and icc's ($\rho$'s). Given that educational data often have large group sizes
in the range of 20-60, even a rather small icc value of 0.10 can have huge effects:
with $c = 30$ the design effect is 1 + 29(0.10) = 3.9.
However, it is not clear how much guidance, if any, this formula gives in terms of
multivariate analysis and the fitting of latent variable models (see also Skinner,
Holt, & Smith, 1989). Muthen and Satorra (1993) carried out a Monte Carlo
study to shed some light on the magnitude of these effects. 
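
The design effect formula of equation 3 is easy to evaluate for the group sizes discussed here. The following is a minimal sketch; the function names design_effect and se_inflation are ours and purely illustrative, and the formula applies to the variance of a mean only, not to latent variable model estimates.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect of equation 3: V_C / V_SRS = 1 + (c - 1) * rho."""
    return 1.0 + (cluster_size - 1.0) * icc


def se_inflation(cluster_size: float, icc: float) -> float:
    """Ratio of the true standard error of a mean to the SRS standard error."""
    return math.sqrt(design_effect(cluster_size, icc))


if __name__ == "__main__":
    # Group sizes used in the Monte Carlo study, with an icc of 0.10.
    for c in (7, 15, 30, 60):
        print(c, round(design_effect(c, 0.10), 2), round(se_inflation(c, 0.10), 2))
```

Because the standard error ratio is the square root of the design effect, a design effect of 3.9 means that the SRS standard error of a mean is too small by a factor of about 2.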

In our experience with survey data, common values for the icc's range from
0.00 to 0.50 where the higher range values have been observed for educational
achievement test scores and the lower range for attitudinal measurements and
health-related measures. Both the way the groups are formed and the content of
the variables have major effects on the icc's. Groups formed as geographical
segments in alcohol use surveys indicated icc's in the range of 0.02 to 0.07 for 
amount of drinking, alcohol dependence, and alcohol abuse. Equally low values 
have been observed in educational surveys when it comes to attitudinal variables 
related to career interests of students sampled within schools. In contrast, 
mathematics achievement scores for U.S. eighth graders show proportions of 
variance due to class components of around 0.30-0.40 and due to school 
components of around 0.15-0.20. 

Muthen and Satorra generated data according to a ten-variable multilevel
latent variable model with a two-factor simple structure. This is a
disaggregated model of the kind described above. In this case, the loading
matrices are equal across the two levels, $\Lambda_B = \Lambda_W$, which means that the same
covariance structure model holds on all three levels: within, between, and total.
Conventional analysis of the total matrix can then be studied in a case where the
model is correct, but standard errors and test of model fit are not. Data were
generated for 200 groups with group sizes (total sample sizes) of
7 (1,400), 15 (3,000), 30 (6,000), and 60 (12,000). These are common values in
educational achievement surveys. One thousand replications were used. 

Table 1 gives chi-square test statistics for a conventional analysis
incorrectly assuming simple random sampling. The model has 34 degrees of
freedom. Using the terms above, this is an analysis of an aggregated model
using the usual sample covariance matrix $S_T$. The within and between
parameters are not separately estimated, but only the parameters of the total
matrix. It is seen that an inflation in chi-square values is obtained both by
increasing group size and increasing icc's, implying that models would be
unnecessarily rejected. Only for small values of the icc's and the group size
might the distortion be ignorable, such as for the combinations (0.05, 7), (0.05,
15), and (0.10, 7). Judging from this table it seems that even for a rather small
icc of 0.10, the distortions may be large if the group size exceeds 15. The
standard errors of the estimates show an analogous pattern in terms of deflated
values. Muthen and Satorra go on to show how standard errors and chi-square tests
of fit can be corrected by taking the clustering into account. They also show that
the ML estimator of the disaggregated, multilevel model performs well, but the
estimator does have problems of convergence at small icc values and small group
sizes and is also sensitive to deviations from normality. In the normal case with
icc's of 0.10 and group sizes ranging from 7 to 60, the multilevel ML estimator
also performs well when the number of groups is reduced from 200 to 50. In our
experience, reducing the number of groups much below 50 does not give
trustworthy results by this estimator. 

Table 1

Chi-Square Testing With Cluster Data

                                      Group size
Intraclass correlation         7        15        30        60

0.05
  Chi-square
    Mean                      35        36        38        41
    Var                       68        72        80        96
    5%                       5.6       7.6      10.6      20.4
    1%                       1.4       1.6       2.8       7.7

0.10
  Chi-square
    Mean                      36        40        46        58
    Var                       75        89       117       189
    5%                       8.5      16.0      37.6      73.6
    1%                       1.0       5.2      17.6      52.1

0.20
  Chi-square
    Mean                      42        52        73       114
    Var                      100       152       302       734
    5%                      23.5      57.7      93.1      99.9
    1%                       8.6      35.0      83.1      99.4

We conclude from these simulations that ignoring the multilevel nature of 
the data and carrying out a conventional covariance structure analysis may very 
well lead to serious distortions of conventional chi square tests of model fit and 
standard errors of estimates. 

4. A Two-Level (Disaggregated) Model 

This section gives a brief review of the theory for two-level modeling and 
estimation. Specific latent variable models are not discussed here. The specific 
latent variable model used in growth modeling is given in the next section where 
it is shown how it fits into the framework given in the present section. 

In line with McDonald and Goldstein (1989) and Muthen (1989, 1990),
assume $g = 1, 2, \ldots, G$ independently observed groups with $i = 1, 2, \ldots, N_g$
individual observations within group $g$. Let $z$ and $y$ represent group- and
individual-level variables, respectively. Arrange the data vector for which
independent observations are obtained as

$$d_g' = (z_g', y_{g1}', y_{g2}', \ldots, y_{gN_g}'), \qquad (4)$$

where we note that the length of $d_g$ varies across groups. The mean vector and
covariance matrix are

$$\mu_{d_g}' = [\mu_z', \; 1_{N_g}' \otimes \mu_y'], \qquad (5)$$

$$\Sigma_{d_g} = \begin{bmatrix} \Sigma_{zz} & \text{symmetric} \\ 1_{N_g} \otimes \Sigma_{yz} & I_{N_g} \otimes \Sigma_W + 1_{N_g} 1_{N_g}' \otimes \Sigma_B \end{bmatrix}. \qquad (6)$$



Muthen (1994a, pp. 378-382) discusses the above covariance structure and 
contrasts it with that of conventional covariance structure analysis. 

Assuming multivariate normality of dg, the ML estimator minimizes the 
function 



$$F = \sum_{g=1}^{G} \left\{ \log|\Sigma_{d_g}| + (d_g - \mu_{d_g})' \Sigma_{d_g}^{-1} (d_g - \mu_{d_g}) \right\}. \qquad (7)$$



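Here, the parameter arrays are potentially of large size if there are many
individuals per group. A remarkable simplification which makes the sizes not
depend on group size is given as (cf. McDonald & Goldstein, 1989; Muthen, 1989,
1990)

$$F = \sum_{d=1}^{D} G_d \left\{ \ln|\Sigma_{B_d}| + \mathrm{tr}\left[\Sigma_{B_d}^{-1} \left(S_{B_d} + N_d (\bar{v}_d - \mu)(\bar{v}_d - \mu)'\right)\right] \right\}
  + (N - G) \left\{ \ln|\Sigma_W| + \mathrm{tr}\left[\Sigma_W^{-1} S_{PW}\right] \right\}, \qquad (8)$$

where

$$\Sigma_{B_d} = \begin{bmatrix} N_d \Sigma_{zz} & \text{symmetric} \\ N_d \Sigma_{yz} & \Sigma_W + N_d \Sigma_B \end{bmatrix},$$

$$S_{B_d} = N_d G_d^{-1} \sum_{k=1}^{G_d} \begin{bmatrix} z_{dk} - \bar{z}_d \\ \bar{y}_{dk} - \bar{y}_d \end{bmatrix} \left[ (z_{dk} - \bar{z}_d)' \;\; (\bar{y}_{dk} - \bar{y}_d)' \right],$$

$$\bar{v}_d - \mu = \begin{bmatrix} \bar{z}_d - \mu_z \\ \bar{y}_d - \mu_y \end{bmatrix},$$

$$S_{PW} = (N - G)^{-1} \sum_{g=1}^{G} \sum_{i=1}^{N_g} (y_{gi} - \bar{y}_g)(y_{gi} - \bar{y}_g)'.$$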
Here, $D$ denotes the number of distinct group sizes, $d$ is an index
denoting a distinct group size category with group size $N_d$, $G_d$ denotes the
number of groups of that size, $S_{B_d}$ denotes a between-group sample covariance
matrix, and $S_{PW}$ is the usual pooled-within sample covariance matrix. 

Muthen (1989, 1990) pointed out that the minimization of the ML fitting
function defined by equation 8 can be carried out by conventional structural
equation modeling software, apart from a slight modification due to the
possibility of singular sample covariance matrices for groups with small $G_d$
values. A multiple-group analysis is carried out for $D + 1$ groups, the first $D$
groups having sample size $G_d$ and the last group having sample size $N - G$.
Equality constraints are imposed across the groups for the elements of the
parameter arrays $\mu$, $\Sigma_{zz}$, $\Sigma_{yz}$, $\Sigma_B$, and $\Sigma_W$. To obtain the correct chi-square test of
model fit, a separate $H_1$ analysis needs to be done (see Muthen, 1990 for details). 

Muthen (1989, 1990) also suggested an ad hoc estimator which considered
only two groups,

$$F' = G \left\{ \ln|\Sigma_{B_c}| + \mathrm{tr}\left[\Sigma_{B_c}^{-1} \left(S_B + c (\bar{v} - \mu)(\bar{v} - \mu)'\right)\right] \right\}
  + (N - G) \left\{ \ln|\Sigma_W| + \mathrm{tr}\left[\Sigma_W^{-1} S_{PW}\right] \right\}, \qquad (9)$$

where the definition of the terms simplifies relative to equation 8 due to
ignoring the variation in group size, dropping the $d$ subscript, and using $D = 1$, $G_d = G$,
and $N_d = c$, where $c$ is the average group size (see Muthen, 1990 for details).
When data are balanced, that is, the group size is constant for all groups, this
gives the ML estimator. Experience with the ad hoc estimator for covariance
structure models with unbalanced data indicates that the estimates, and also the
standard errors and chi-square test of model fit, are quite close to those obtained
by the true ML estimator. This observation has also been made for growth
models where a mean structure is added to the covariance structure (see Muthen,
1994b). 
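
The sample statistics entering the two-group fitting function can be computed directly from raw data. The sketch below is illustrative only and covers individual-level variables y with no group-level z. It assumes a between matrix with divisor G - 1 and an ANOVA-style average group size c = [N^2 - sum(N_g^2)] / [N(G - 1)]; neither choice is stated in this section, so both should be read as assumptions. The function name pooled_within_between is hypothetical.

```python
import numpy as np


def pooled_within_between(y, groups):
    """Return (S_PW, S_B, c) for an N x p data matrix y and N group labels.

    S_PW: pooled-within covariance matrix, divisor N - G.
    S_B:  between covariance matrix of group means weighted by group size,
          divisor G - 1 (an assumption of this sketch).
    c:    ANOVA-style average group size (also an assumption of this sketch).
    """
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n_total, p = y.shape
    n_groups = labels.size
    grand_mean = y.mean(axis=0)

    s_pw = np.zeros((p, p))
    s_b = np.zeros((p, p))
    sizes = []
    for lab in labels:
        yg = y[groups == lab]
        n_g = yg.shape[0]
        sizes.append(n_g)
        dev_w = yg - yg.mean(axis=0)              # deviations from the group mean
        s_pw += dev_w.T @ dev_w
        dev_b = (yg.mean(axis=0) - grand_mean)[:, None]
        s_b += n_g * (dev_b @ dev_b.T)            # group mean deviation, weighted by size

    s_pw /= (n_total - n_groups)
    s_b /= (n_groups - 1)
    sizes = np.asarray(sizes, dtype=float)
    c = (n_total ** 2 - np.sum(sizes ** 2)) / (n_total * (n_groups - 1))
    return s_pw, s_b, c
```

Under these assumptions S_PW estimates the within matrix, while S_B estimates the within matrix plus c times the between matrix, so (S_B - S_PW) / c gives a simple moment estimate of the between matrix. For balanced data c equals the common group size.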

In Section 6 we will return to the specifics of how the mean and covariance 
structure of equations 8 and 9 can be represented in conventional structural 
equation modeling software for the case of growth modeling. The growth model 
will be presented next. 
5. A Three-Level Hierarchical Model 

Random coefficient growth modeling (see, e.g., Laird & Ware, 1982), or
multilevel modeling (see, e.g., Bock, 1989), describes individual differences in
growth. In this way, it goes beyond conventional structural equation modeling of
longitudinal data and its focus on auto-regressive models (see, e.g., Joreskog &
Sorbom, 1977; Wheaton, Muthen, Alwin, & Summers, 1977). Random-coefficient
modeling for three-level data is described, for example, in Goldstein (1987), Bock
(1989), and Bryk and Raudenbush (1992). 

Consider the three-level data

Group (school, class):   $g = 1, 2, \ldots, G$
Individual:              $i = 1, 2, \ldots, N_g$
Time:                    $t = 1, 2, \ldots, T$

$y_{git}$:  individual-level outcome variable
$x_{it}$:   individual-level, time-related variable (age, grade)
$v_{git}$:  individual-level, time-varying covariate
$w_{gi}$:   individual-level, time-invariant covariate
$z_g$:    group-level variable

and the growth equation,

$$y_{git} = \alpha_{gi} + \beta_{gi} x_{it} + \gamma_{git} v_{git} + \zeta_{git}. \qquad (10)$$



An important special case that will be the focus of this paper is where the
time-related variable $x_{it} = x_t$. An example of this is educational achievement studies
where $x_t$ corresponds to grade. The $x_t$ values are, for example, 0, 1, 2, ..., T-1 for
linear growth. We will also restrict attention to the case of $\gamma_{git} = \gamma_{gt}$. The three
levels of the growth model are then

$$y_{git} = \alpha_{gi} + \beta_{gi} x_t + \gamma_{gt} v_{git} + \zeta_{git}, \qquad (11)$$



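$$\alpha_{gi} = \alpha_g + \pi_\alpha w_{Wgi} + \delta_{\alpha gi}, \qquad
  \beta_{gi} = \beta_g + \pi_\beta w_{Wgi} + \delta_{\beta gi}, \qquad (12)$$

$$\alpha_g = \alpha + \pi_\alpha w_{Bg} + \kappa_\alpha z_g + \delta_{\alpha g}, \qquad
  \beta_g = \beta + \pi_\beta w_{Bg} + \kappa_\beta z_g + \delta_{\beta g}, \qquad (13)$$

where the variation in the individual-level, time-invariant covariate $w_{gi}$ is
decomposed into between- and within-group parts,

$$w_{gi} = w_{Bg} + w_{Wgi}. \qquad (14)$$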
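
A small simulation can make the three-level structure concrete. The sketch below is illustrative only: it assumes no covariates (v, w, and z are dropped), normally distributed and mutually independent random terms, balanced groups, and arbitrary parameter values. The function name simulate_growth and all of its arguments are hypothetical.

```python
import numpy as np


def simulate_growth(G=50, n_per_group=30, T=4, seed=0,
                    alpha=50.0, beta=3.0,
                    sd_alpha_b=2.0, sd_beta_b=0.5,
                    sd_alpha_w=5.0, sd_beta_w=1.0, sd_resid=2.0):
    """Simulate y[g, i, t] from a linear three-level growth model without covariates."""
    rng = np.random.default_rng(seed)
    x = np.arange(T, dtype=float)                 # time scores x_t = 0, 1, ..., T-1

    # Level 3: group-specific intercept and slope vary around the overall means.
    alpha_g = alpha + rng.normal(0.0, sd_alpha_b, size=G)
    beta_g = beta + rng.normal(0.0, sd_beta_b, size=G)

    # Level 2: individual intercept and slope vary around their group's values.
    alpha_gi = alpha_g[:, None] + rng.normal(0.0, sd_alpha_w, size=(G, n_per_group))
    beta_gi = beta_g[:, None] + rng.normal(0.0, sd_beta_w, size=(G, n_per_group))

    # Level 1: outcome at each time point follows the individual's growth line.
    resid = rng.normal(0.0, sd_resid, size=(G, n_per_group, T))
    y = alpha_gi[:, :, None] + beta_gi[:, :, None] * x + resid
    return y, x


if __name__ == "__main__":
    y, x = simulate_growth()
    print(y.shape)              # (groups, individuals, time points)
    print(y.mean(axis=(0, 1)))  # wave means, roughly alpha + beta * x_t
```

Reshaping y to (G * n_per_group, T) gives one T-dimensional outcome vector per individual, which is the two-level arrangement used in the remainder of the paper.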



In the case of growth modeling using a simple random sample of 
individuals, it is possible to translate the growth model from a two-level model to 
a one-level model by considering a T x 1 vector of outcome variables y for each 
individual. Analogously, we may reduce the three-level model to two levels as 
follows. 



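$$y_{gi} = \begin{pmatrix} y_{gi1} \\ \vdots \\ y_{giT} \end{pmatrix}
  = [\mathbf{1} \;\; x] \begin{pmatrix} \alpha_{gi} \\ \beta_{gi} \end{pmatrix}
  + \begin{pmatrix} \gamma_{g1} v_{gi1} + \zeta_{gi1} \\ \vdots \\ \gamma_{gT} v_{giT} + \zeta_{giT} \end{pmatrix}, \qquad (15)$$

where $\mathbf{1}$ is a T x 1 vector of ones and $x = (x_1, \ldots, x_T)'$,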



which may be expressed in five terms 

The first term represents the mean as a function of the mean of the initial 
status and the mean of the growth rate. The second and third terms correspond 
to between-group (school) variation. The fourth and fifth terms correspond to 
within-group variation. 
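
As a hedged sketch of how a five-term expression of this kind can arise, consider the case without covariates and assume that the level-1 residual vector splits into a between-group part and a within-group part. The labels $\zeta_{Bg}$ and $\zeta_{Wgi}$ for these two parts are ours, introduced for illustration, and are not recovered from the source. Stacking the T time points and substituting the individual-level and group-level equations then gives:

```latex
\begin{aligned}
y_{gi} &= [\mathbf{1} \;\; x]
          \begin{pmatrix} \alpha_{gi} \\ \beta_{gi} \end{pmatrix}
          + \zeta_{Bg} + \zeta_{Wgi}, \\
\alpha_{gi} &= \alpha + \delta_{\alpha g} + \delta_{\alpha gi}, \qquad
\beta_{gi} = \beta + \delta_{\beta g} + \delta_{\beta gi}, \\
y_{gi} &= \underbrace{[\mathbf{1} \;\; x]
          \begin{pmatrix} \alpha \\ \beta \end{pmatrix}}_{\text{mean}}
        + \underbrace{[\mathbf{1} \;\; x]
          \begin{pmatrix} \delta_{\alpha g} \\ \delta_{\beta g} \end{pmatrix}
          + \zeta_{Bg}}_{\text{between}}
        + \underbrace{[\mathbf{1} \;\; x]
          \begin{pmatrix} \delta_{\alpha gi} \\ \delta_{\beta gi} \end{pmatrix}
          + \zeta_{Wgi}}_{\text{within}} .
\end{aligned}
```

Under these assumptions the first term carries the means of initial status and growth rate, the second and third terms vary only across groups, and the fourth and fifth terms vary across individuals within groups.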

6. Latent Variable Formulation 

For the case of simple random sampling of individuals, Meredith and Tisak
(1984, 1990) have shown that the random coefficient model of the previous
section can be formulated as a latent variable model (for applications in
psychology, see McArdle & Epstein, 1987; for applications in education, see
Muthen, 1993 and Willett & Sayer, 1993; for applications in mental health, see
Muthen, 1983, 1991). The basic idea can be simply described as follows. In
equation 10, $\alpha_{gi}$ is a latent variable varying across individuals. Assuming the
special case of $x_{it} = x_t$, the x variable becomes a constant which multiplies a
second latent variable $\beta_{gi}$. 

The latent variable formulation can be directly extended to the three-level
data case. In line with Muthen (1989, 1990), Figure 1 shows a path diagram
which is useful in implementing the multilevel estimation using the multilevel
fitting function F or F'. The figure corresponds to the case of no covariates v, w,
or z. It shows how the covariance structure

$$\Sigma_W + N_d \Sigma_B \qquad (17)$$

can be represented by latent variables, introducing a latent between-level
variable for each outcome variable y. On the within side, we note that the $\alpha$
factor influences the y's with coefficients 1 at all time points. The constants of $x_t$
are the coefficients for the influence of the $\beta$ factor on the y variables. This
makes it clear that non-linear growth can be accommodated by estimating the $x_t$
coefficients, for example, holding the first two values fixed at 0 and 1,
respectively, for identification purposes. The within-level $\alpha$ and $\beta$ factors
correspond to the $\delta_{\alpha gi}$ and $\delta_{\beta gi}$ residuals of equation 12. The between-level $\alpha$
and $\beta$ factors correspond to the $\delta_{\alpha g}$ and $\delta_{\beta g}$ residuals of equation 13. From
equation 16 it is clear that the influence from these two factors is the same on
the between side as it is on the within side. Corresponding to this, in Figure 1
the $\Sigma_B$ structure is identical to the $\Sigma_W$ structure. A strength of the latent
variable approach is that this equality assumption can easily be relaxed. For
example, it may not be necessary to include between-group variation in the growth
rate. These latent between-level variables may also be related to observed
between-level variables $z_g$ as in Section 4. 
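
The role of the fixed coefficients 1 and x_t can be illustrated numerically. The sketch below builds the loading matrix [1 x] and the mean vector and covariance matrix implied by two growth factors. It is an illustration under stated assumptions rather than the estimation procedure of this paper: the residual covariance matrix is taken to be diagonal, and the function names growth_loadings, implied_covariance, implied_mean, and between_block are hypothetical.

```python
import numpy as np


def growth_loadings(x):
    """T x 2 loading matrix: a column of ones (intercept factor) and the time scores x_t."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x])


def implied_covariance(x, psi, theta_diag):
    """Covariance of the T outcomes implied by factor covariance psi (2 x 2)
    and diagonal residual variances theta_diag (an assumption of this sketch)."""
    lam = growth_loadings(x)
    return lam @ np.asarray(psi, dtype=float) @ lam.T + np.diag(theta_diag)


def implied_mean(x, alpha_mean, beta_mean):
    """Means of the T outcomes as functions of the two factor means."""
    return growth_loadings(x) @ np.array([alpha_mean, beta_mean], dtype=float)


def between_block(sigma_w, sigma_b, n_d):
    """Expression 17: within matrix plus group size times between matrix."""
    return np.asarray(sigma_w, dtype=float) + n_d * np.asarray(sigma_b, dtype=float)
```

Replacing the later entries of x with free parameters, while holding the first two at 0 and 1, corresponds to the non-linear growth specification described above.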

A special feature of the growth model is the mean structure imposed on $\mu$ in
the ML fitting function of equation 8, where $\mu$ represents the means of group-
and individual-level variables. In the specific growth model shown in Figure 1,
the mean structure arises from the five observed variable means being expressed
as functions of the means of the $\alpha$ and $\beta$ factors, here applied on the between
side, see equation 16. Equation 8 indicates that the means need to be included
on the between side of Figure 1 given that the mean term of F is scaled by $N_d$,
while the means on the within side are fixed at zero. This implies that dummy
zero means are entered for the within group. The degrees of freedom for the
chi-square test of model fit obtained in conventional software then need to be
reduced by the number of y variables. 

Figure 1. Latent variable growth model formulation for two-level, five-wave data.
(Path diagram with a BETWEEN part and a WITHIN part; the observed outcomes Y1 to
Y5 correspond to Grades 7 to 11, each with a latent between-level component.)

Further details and references on latent variable modeling with two-level 
data are given in Muthen (1994b), also giving suggestions for analysis strategies. 
Software is available from the author for calculating the necessary sample 
statistics, including intraclass correlations. 

It is clear that the Figure 1 model can be easily generalized to applications 
with multiple indicators of latent variable constructs instead of single outcome 
measurements y at each time point. The covariates may also be latent variables 
with multiple indicators. Estimates may also be obtained for the individual
growth curves by estimating the individual values of the intercept and slope
factors $\alpha$ and $\beta$. This relates to Empirical Bayes estimation in the conventional
growth literature (see, e.g., Bock, 1989). 

7. Analysis of Two-Wave Achievement Data 

We will first consider data from the Second International Mathematics 
Study (SIMS; Crosswhite et al., 1985) drawing on analyses presented in Muthen 
(1991, 1992). Here, a national probability sample of school districts was selected 
proportional to size; a probability sample of schools was selected proportional to 
size within school district, and two classes were randomly drawn within each 
school. The data consist of 3,724 students observed in 197 classes from 113
schools with class sizes varying from 2 to 38 with a typical value of around 20.
Eight variables are considered corresponding to various areas of eighth-grade
mathematics. The same set of items was administered as a pretest in the fall of
eighth grade and again as a posttest in the spring. 

Muthen (1991) poses the following questions: 

The substantive questions of interest in this article are the variance decomposition of
the subscores with respect to within-class student variation and between-class
variation and the change of this decomposition from pretest to posttest. In the SIMS
... such variance decomposition relates to the effects of tracking and differential
curricula in eighth-grade math. On the one hand, one may hypothesize that effects of
selection and instruction tend to increase between-class variation relative to
within-class variation, assuming that the classes are homogeneous, have different
performance levels to begin with, and show faster growth for higher initial
performance level. On the other hand, one may hypothesize that eighth-grade
exposure to new topics will increase individual differences among students within
each class so that posttest within-class variation will be sizable relative to posttest
between-class variation. 

7.1 Measurement error and reliability of multiple indicators 

Analyses addressing the above questions can be done for overall math 
performance, but it is also of interest to study if the differences vary from more 
basic to more advanced math topics. For example, one may ask if the differences 
are more marked for more advanced topics. When focusing on specific subsets of 
math topics, the resulting variables consist of a sum of rather few items and 
therefore contain large amounts of measurement error. At Grade eight, the 
math knowledge is not extensively differentiated and a unidimensional latent 
variable model may be formulated to estimate the reliabilities for a set of such 
variables. Muthen (1991) formulated a multilevel factor analysis model for the 
two-wave data. Given that the amount of across-school variation was small 
relative to the across-classroom variation, the school distinction was ignored and 
the data analyzed as a two-level structure. At each time point unidimensionality 
was specified for both within- and between-class variation, letting factors and 
measurement errors correlate across time on each level. Table 2 presents 
estimates from both the multilevel factor analysis (MFA) model (see the Within 
and Between columns) and a conventional analysis (see the Total columns). 
Reliability is estimated from the factor model as the proportion of variance in the 
indicator accounted for by the factor. As is seen from Table 2 the estimated 
student-level (within) reliabilities are considerably lower than reliabilities 
obtained from a total analysis. 
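
The definition of reliability used here, the proportion of indicator variance accounted for by the factor, can be written out for the within, between, and total levels. The sketch below is illustrative: it assumes a single factor with the same loading on both levels, so that the total factor and error variances are sums of the within and between parts, and the numerical values in the usage note are invented rather than taken from Table 2. The function names are hypothetical.

```python
def indicator_reliability(loading, factor_var, error_var):
    """Proportion of the indicator's variance accounted for by the factor."""
    explained = loading ** 2 * factor_var
    return explained / (explained + error_var)


def level_reliabilities(loading, psi_w, psi_b, theta_w, theta_b):
    """Within, between, and total reliabilities for one indicator.

    psi_w, psi_b:     within and between factor variances
    theta_w, theta_b: within and between error variances
    Assumes equal loadings on both levels (an assumption of this sketch).
    """
    return {
        "within": indicator_reliability(loading, psi_w, theta_w),
        "between": indicator_reliability(loading, psi_b, theta_b),
        "total": indicator_reliability(loading, psi_w + psi_b, theta_w + theta_b),
    }


if __name__ == "__main__":
    # Invented values: sizable within error variance, very little between error variance.
    print(level_reliabilities(1.0, psi_w=1.0, psi_b=1.5, theta_w=1.5, theta_b=0.05))
```

With these invented values the within reliability is 1.0 / 2.5 = 0.40, while the total reliability is higher because the between-class factor variance is added to the numerator with little additional error variance.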

In psychometrics it is well-known that reliabilities are lower in more 
homogeneous groups (Lord & Novick, 1968). Here, however, it seems important 
to make the distinction shown in Figure 2. 

The top panel of Figure 2 corresponds directly to the Lord and Novick case.
The three line segments may be seen as representing three different classrooms
with different student factor values $\eta$ and student test score values y. The
regression line for all classrooms is given as a broken line. All classrooms have
the same intercept and slope. For any given classroom, the range of the factor is
restricted and due to this restriction in range the reliability is attenuated
relative to that of all classrooms. 

Table 2

The Second International Mathematics Study: Analysis of Math Achievement From Two Time
Points

Reliabilities

                              Pretest                      Posttest
                                 MFA      MFA                 MFA      MFA
Variables   # Items   Total    Within   Between    Total    Within   Between

RPP            8       .61      .44      .96        .68      .52      .97
FRACT          8       .60      .38      .97        .68      .49      .98
EQ EXP         6       .36      .18      .83        .55      .32      .92
INTNUM         2       .34      .18      .81        .43      .25      .88
STESTI         5       .44      .25      .86        .52      .34      .89
AREAVOL        2       .29      .18      .82        .38      .23      .84
COORVIS        3       .34      .18      .92        .42      .26      .80
PFIGURE        5       .32      .17      .78        .46      .31      .77

The bottom panel of Figure 2 probably corresponds more closely to the 
situation at hand. Here, the three classrooms have the same slopes but different 
intercepts. The regression for the total analysis is marked as a broken line. It 
gives a steeper slope and a higher reliability than for any of the classrooms. One 
can argue, however, that the higher reliability is incorrectly obtained by 
analyzing a set of heterogeneous subpopulations as if they were one single 
population (cf. Muthen, 1989). In contrast, the multilevel model captures the 
varying intercepts feature and reveals the lower within reliability which holds 
for each classroom. 

The Table 2 between reliabilities are considerably higher than the within 
values. These between coefficients concern reliable variation across classrooms 
and therefore have another interpretation than the student-level reliabilities. 
The results indicate that what distinguishes classrooms with respect to math 
performance is largely explained by a single dimension, that is, a total score, and 
that on the whole the topics measure this dimension rather similarly. 
7.2 Attenuation of intraclass correlations by measurement error 

We will consider the size of the intraclass correlations as indicators of 
school heterogeneity. This can be seen as a function of social stratification giving 
across-school differences in student "intake," as well as differences in the 
teaching and what schools do with a varied student intake. The U.S. math 
curriculum in Grades 7-10 is very varied with large differences in emphasis on 
more basic topics such as arithmetic and more advanced topics such as geometry 
and algebra. Ability groupings ("tracking") are often used. In some other 
countries, however, a more egalitarian teaching approach is taken, the 
curriculum is more homogeneous, and the social stratification less strong. In 
international studies, the relative sizes of variance components for student, 
class, and school are used to describe such differences (see, e.g., Schmidt, Wolfe, 
& Kifer, 1993). 

Table 3 gives conventional variance component results from nested,
random-effects ANOVA in the form of the proportion of variance between
classrooms relative to the total variance. This is the same as the intraclass
correlation measure. It is seen that the intraclass correlations increase from
pretest to posttest. The problem with these values is, however, that they are
likely to be attenuated by the influence of measurement error. This is because
student-level measurement error adds to the within part of the total variance,
that is, the denominator of the intraclass correlation. The distortion is made
worse by the fact that the student-level measurement error is likely to decrease
from pretest to posttest due to more familiarity with the topics tested. 

Table 3

The Second International Mathematics Study: Analysis of Math
Achievement From Two Time Points

Intraclass Correlations
(proportion between classroom variance)

                           ANOVA              MFA
Variables   # Items     Pre     Post       Pre     Post

RPP            8        .34     .38        .54     .52
FRACT          8        .38     .41        .60     .58
EQ EXP         6        .27     .39        .65     .64
INTNUM         2        .29     .31        .63     .61
STESTI         5        .33     .34        .58     .56
AREAVOL        2        .17     .24        .54     .52
COORVIS        3        .21     .32        .57     .55
PFIGURE        5        .23     .33        .60     .54

The MFA columns of Table 3 give the multilevel factor analysis assessment 
of intraclass correlations using the one-factor model in the previous subsection. 
Here, the intraclass correlations are computed using the between and within 
variances for the factor variable, not including measurement error variance. It is 
seen that these intraclass correlations are considerably higher and indicate a 
slight decrease over time. This is a change in the opposite direction from the 
ANOVA results. Results from ANOVA would therefore give misleading evidence 
for answering the questions posed in Muthen (1991). 
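
The opposite directions of change can be checked directly from the values in Table 3. The following sketch simply tabulates the posttest minus pretest difference for each variable under both methods; the dictionary layout and function name are illustrative choices.

```python
# Intraclass correlations from Table 3, in the order
# (ANOVA pre, ANOVA post, MFA pre, MFA post).
table3 = {
    "RPP":     (0.34, 0.38, 0.54, 0.52),
    "FRACT":   (0.38, 0.41, 0.60, 0.58),
    "EQEXP":   (0.27, 0.39, 0.65, 0.64),
    "INTNUM":  (0.29, 0.31, 0.63, 0.61),
    "STESTI":  (0.33, 0.34, 0.58, 0.56),
    "AREAVOL": (0.17, 0.24, 0.54, 0.52),
    "COORVIS": (0.21, 0.32, 0.57, 0.55),
    "PFIGURE": (0.23, 0.33, 0.60, 0.54),
}


def changes(pre_index, post_index):
    """Posttest minus pretest intraclass correlation for every variable."""
    return {
        name: round(row[post_index] - row[pre_index], 2)
        for name, row in table3.items()
    }


anova_change = changes(0, 1)
mfa_change = changes(2, 3)

for name in table3:
    print(f"{name:8s} ANOVA {anova_change[name]:+.2f}   MFA {mfa_change[name]:+.2f}")
```

Every ANOVA difference is positive and every MFA difference is negative, which is the reversal described above.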

8. Analysis of Four-Wave Data by Growth Modeling 

The Longitudinal Study of American Youth (LSAY) is a national study of 
performance in and attitudes towards science and mathematics. It is conducted 
as a longitudinal survey of two cohorts spanning Grades 7-12. LSAY uses a 
national probability sample of about 50 public schools, testing an average of 
about 50 students per school every fall starting in 1987. Data from four time 
points, Grades 7-10, and one cohort will be used to illustrate the methodology for 
analysis of individual differences in growth. 

In this analysis, mathematics achievement and attitudes toward math will 
be related to each other and to socio-economic status (SES) of the family. The 
data to be analyzed consist of a total sample of 1,869 students in 50 schools with 
complete data on all variables in the analysis. Mathematics achievement is 
quantified as a latent variable (theta) score obtained by IRT techniques using 
multiple test forms and a large number of items including arithmetic, geometry, 
and algebra. The intraclass correlations for the math achievement variable for 
the four grades are estimated as 0.18, 0.13, 0.15, 0.14, indicating a noteworthy 
degree of across-school variation in achievement. Attitude toward math was 
measured by a summed score using items having to do with how hard the 
student finds math, whether math makes the student anxious, whether the 
student finds math important, etc. As expected, the intraclass correlations for 
the attitude variable are considerably lower than for achievement. They are 
estimated as 0.05, 0.06, 0.04, 0.02. The Pearson product-moment correlations 
between achievement and attitude are estimated as 0.4-0.6 for each of the four 
time points. The measure of socio-economic status pertains to parents' 
educational levels, occupational status, and the report of some resources in the 
home. It has an intraclass correlation of 0.17. 

The analysis considers a growth model extending the single-variable, two- 
level growth model of Figure 1 to a simultaneous model of the growth process for 
both achievement and attitude. SES will be used as a student-level, time- 
invariant covariate, explaining part of the variation in these two growth 
processes. No observed variables on the school level will be used. The model is 
described graphically in Figure 3. 

Let the top row of observed variables (squares) represent achievement at 
each of the four time points and the bottom row the corresponding attitudes. 
The SES covariate is the observed variable to the left in the figure. 

Consider first the student- (within-) level part of Figure 3. The latent 
variable (circle) to the right of the observed variable of SES is hypothesized to 
influence four latent variables, the intercept (initial status) factor and slope 
(growth rate) factor for achievement (the top two latent variables) and the 
intercept and slope factors for attitude (the bottom two latent variables). The 
intercept for each growth process is hypothesized to have a positive influence on 
the slope of the other growth process. In order not to clutter the picture, 
residuals and their correlations are not drawn in the figure, but a residual 
correlation is included for the intercepts as well as the slopes. For each growth 
process, the model is as discussed in connection with Figure 1. Preliminary 
analyses suggest that nonlinear growth for achievement should be allowed for by 
estimating the growth steps from Grade 8 to 9 and from Grade 9 to 10, while for 
attitude a linear process is sufficient. In fact, for attitude, a slight decline is 
observed over time. The reason for this is not clear, but it may reflect that 
among a sizeable part of the student population there is an initial positive 
attitude about math which wears off over the grades either because math gets 
harder or because they stop taking math. For each growth process, correlations 
are allowed for among residuals at adjacent time points. Residual correlations 
are also allowed for across processes at each time point. Cross-lagged effects are 
allowed for as indicated in the figure. It should be noted, however, that even 
without cross-lagged effects the model postulates that achievement and attitude 
do influence each other via their growth intercepts and slopes. For example, if 
the initial status factor for attitude has a positive influence on the growth rate 
factor for achievement, initial attitude has a positive influence on later 
achievement scores. 

Figure 3. Two-level, four-wave growth model for achievement and attitude related to 
socio-economic status. 

The hierarchical nature of the data is taken into account by inclusion of the 
between- (school-) level part of the model. The between-level part of Figure 3 is 
similar to the within-level part. Starting with the SES variable to the left in the 
figure, it is seen that the variation in this variable is decomposed into two latent 
variables, one for the within variation and one for the between variation (the 
between factor is to the left of the SES square). At the top and the bottom of the 
figure are given the between-level intercept and slope factors for achievement 
and attitude, respectively. As in Figure 1, the influence of these factors on 
achievement/attitude is specified to have the same structure and parameter 
values as for the within-part of the model. A minor difference here is that the 
intercept for one process is not specified to influence the slope of the other 
process, but all four intercept and slope factor residuals are instead allowed to be 
freely correlated. Also, on the between side, correlations among adjacent 
residuals over time are not included in the model. 
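
The idea of splitting a variable into a between part and a within part can be sketched with observed school means. This is only an analogy: in the model the two components are latent and are estimated, not computed from sample means. The SES scores and school labels below are hypothetical.

```python
from statistics import mean

# Hypothetical SES scores for students grouped by school (illustration only).
ses_by_school = {
    "school_a": [1.0, 2.0, 3.0],
    "school_b": [4.0, 5.0, 6.0],
    "school_c": [2.0, 4.0, 6.0],
}


def decompose(scores_by_cluster):
    """Split each score into a between part (the cluster mean) and a
    within part (the deviation from that mean)."""
    between, within = [], []
    for scores in scores_by_cluster.values():
        cluster_mean = mean(scores)
        for score in scores:
            between.append(cluster_mean)
            within.append(score - cluster_mean)
    return between, within


between_part, within_part = decompose(ses_by_school)

# The two parts add back up to the original scores, and their
# cross-product sums to zero in the sample.
original = [score for scores in ses_by_school.values() for score in scores]
reconstructed = [b + w for b, w in zip(between_part, within_part)]
cross_product = sum(b * w for b, w in zip(between_part, within_part))

print(reconstructed == original, abs(cross_product) < 1e-9)
```

Because the within deviations sum to zero inside every school, the two parts are uncorrelated in the sample, which mirrors the separate within and between factors for SES in the figure.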

As a comparison to the above growth model, a more conventional auto- 
regressive, cross-lagged model will also be analyzed. This is shown in Figure 4 
in its two-level form. On the within level, the figure shows a lag-one auto- 
regressive process for both achievement and attitude with lag-one cross-lagged 
effects, where SES is allowed to influence the outcomes at each time point. The 
between-level part of the model is here not given a specific structure but the 
between-level covariance matrix is made unrestricted by allowing all between- 
level factors to freely correlate. 
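
The within-level structure of the lag-one auto-regressive, cross-lagged model can be sketched as a simple recursion. All path coefficients below are hypothetical and residuals are omitted; the variables are treated as deviations from their means, and the Grade 7 values are taken as given. The names `COEFFICIENTS`, `next_wave`, and `trajectory` are illustrative only.

```python
# Hypothetical path coefficients for illustration; these are not
# estimates from the LSAY analysis.
COEFFICIENTS = {
    "ach_on_ach": 0.8,   # auto-regressive effect for achievement
    "ach_on_att": 0.1,   # cross-lagged effect, attitude -> achievement
    "ach_on_ses": 0.5,   # SES effect on achievement
    "att_on_att": 0.6,   # auto-regressive effect for attitude
    "att_on_ach": 0.05,  # cross-lagged effect, achievement -> attitude
    "att_on_ses": 0.1,   # SES effect on attitude
}


def next_wave(achievement, attitude, ses, coef=COEFFICIENTS):
    """One lag-one step: each outcome depends on both outcomes at the
    previous time point and on SES (residuals omitted)."""
    next_achievement = (coef["ach_on_ach"] * achievement
                        + coef["ach_on_att"] * attitude
                        + coef["ach_on_ses"] * ses)
    next_attitude = (coef["att_on_att"] * attitude
                     + coef["att_on_ach"] * achievement
                     + coef["att_on_ses"] * ses)
    return next_achievement, next_attitude


def trajectory(achievement, attitude, ses, waves=4):
    """Return the (achievement, attitude) pairs for four waves."""
    path = [(achievement, attitude)]
    for _ in range(waves - 1):
        achievement, attitude = next_wave(achievement, attitude, ses)
        path.append((achievement, attitude))
    return path


for grade, (ach, att) in zip(range(7, 11), trajectory(1.0, 1.0, ses=0.0)):
    print(f"Grade {grade}: achievement {ach:.3f}, attitude {att:.3f}")
```

In this representation the two processes are linked only through effects between adjacent time points, whereas the growth model links them through their intercept and slope factors.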

For simplicity in the analyses to be presented, the two-group ad hoc 
estimator discussed in Section 4 will be used and not the full-information 
maximum-likelihood estimator. This means that the standard errors and chi- 
square tests of model fit are not exact but are approximations; given our 
experience they are presumably quite reasonable ones. Consequently, 
statements about significance and model fit should not be interpreted in exact 
terms. 

Figure 4. Two-level, four-wave auto-regressive model for achievement and attitude related 
to socio-economic status. 

It is of interest to first ignore the hierarchical nature of the data and give 
the incorrect tests of fit for the single-level analogs of the auto-regressive and 
growth models. To this end, the conventional maximum-likelihood fitting 
function is used. The lag-one auto-regressive model obtained a chi-square value 
of 534.7 with 12 degrees of freedom. To improve fit it was necessary to include a 
lag-three model for the auto-regressive part and this gave a chi-square value of 
22.3 with 6 df. The correct two-level tests of fit using the lag-one model of Figure 
4 resulted in a chi-square value of 518.8 with 12 df, while a two-level, lag-three 
model gave a chi-square value of 28.1 with 6 df. The degrees of freedom are the 
same for the single-level and two-level models because the two-level model 
doubles the number of parameters as well as the number of sample variances 
and covariances that are analyzed (a mean structure is not involved in this 
model). The two-level, lag-three model shows positive and significant student- 
level cross-lagged effects of achievement and attitude on each other. The lag- 
three auto-regressive structure of the model, however, makes it a rather complex 
and inelegant representation of the data. 
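
One way to read the degrees-of-freedom argument is sketched below. It assumes nine observed variables (four achievement scores, four attitude scores, and SES). The within-level parameter count of 33 is not stated in the text; it is inferred here from the reported 12 df and should be treated as an assumption. The sketch shows that an unrestricted between-level covariance matrix uses exactly as many parameters as the between level adds in sample variances and covariances.

```python
def n_moments(p):
    """Number of distinct variances and covariances among p variables."""
    return p * (p + 1) // 2


p = 9  # assumed: 4 achievement scores, 4 attitude scores, and SES

# Assumed within-level parameter count for the lag-one model, inferred
# from the reported 12 degrees of freedom (not given in the text).
within_params_lag_one = 33

# Single-level analysis: one covariance matrix, one set of parameters.
single_level_df = n_moments(p) - within_params_lag_one

# Two-level analysis: a second covariance matrix is analyzed, and the
# unrestricted between-level matrix has one free parameter per moment.
between_params = n_moments(p)
two_level_df = 2 * n_moments(p) - (within_params_lag_one + between_params)

print(n_moments(p), single_level_df, two_level_df)
```

Under these assumptions the between level contributes no additional degrees of freedom, so both analyses have 12 df.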

Turning to the growth model, the single-level model which ignores the 
hierarchical nature of the data obtained the incorrect chi-square value of 44. C 
with 8 df (p < 0.001). The two-level model obtained the chi-square value of 68.4 
with 39 df (p = 0.003). This may perhaps be regarded as a reasonable fit at n = 
1,869. The estimates of this model are given in Table 4. 
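
The reported p-value can be checked approximately from the chi-square value and its degrees of freedom. The sketch below uses the Wilson-Hilferty normal approximation to the chi-square distribution and only the standard library; it is an approximation, not the computation used in the paper, and the function name is illustrative.

```python
import math


def chi_square_p_value(chi_square, df):
    """Approximate upper-tail p-value of a chi-square statistic using the
    Wilson-Hilferty cube-root normal approximation."""
    spread = 2.0 / (9.0 * df)
    z = ((chi_square / df) ** (1.0 / 3.0) - (1.0 - spread)) / math.sqrt(spread)
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Two-level growth model: chi-square of 68.4 with 39 df.
p_growth = chi_square_p_value(68.4, 39)

# Two-level, lag-one auto-regressive model: chi-square of 518.8 with 12 df.
p_lag_one = chi_square_p_value(518.8, 12)

print(f"growth model:  p is about {p_growth:.4f}")
print(f"lag-one model: p is about {p_lag_one:.2e}")
```

The approximation gives a value between 0.002 and 0.003 for the growth model, in line with the reported p = 0.003, while the lag-one model is rejected far more decisively.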

What is particularly interesting about the two-level growth model is that in 
contrast to the auto-regressive model, none of the student-level cross-lagged 
effects are significantly different from zero. This makes for a very parsimonious 
model where the achievement and attitude processes are instead correlated via 
the correlations among their intercept and slope factors. The correlation 
between the intercept factors (not shown in the table) is positive (0.27) while the 
slope factor correlation is negligible (0.08). The influences from the intercepts to 
the slopes turn out to be not significant. 

The student-level influence from SES is significantly positive for both the 
achievement and attitude intercepts. It is not significant for the achievement 
slope and significantly negative for the attitude slope. It is not clear what the 
negative effect represents, but this effect would be seen if students from high 
SES homes have a strong initial positive attitude which later becomes less 
positive. SES explains 12% of the student variation in the achievement intercept 
while it explains only 1% of the student variation in the attitude intercept. In 
terms of the achievement growth, the estimates indicate that relative to the 
positive growth from Grade 7 to 8, the growth is accelerated in later grades. For 
attitude, linear growth is maintained. 
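
The statement about accelerated growth can be read off the time scores in the growth curve part of Table 4. The sketch below computes the grade-to-grade steps in units of the Grade 7 to 8 step; the variable and function names are illustrative.

```python
# Time scores from Table 4. For achievement the first two are fixed at
# 0 and 1 and the last two are estimated; for attitude all four are fixed.
achievement_time_scores = {7: 0.0, 8: 1.0, 9: 2.60, 10: 3.85}
attitude_time_scores = {7: 0.0, 8: 1.0, 9: 2.0, 10: 3.0}


def growth_steps(time_scores):
    """Grade-to-grade increments in the time scores, expressed in units
    of the Grade 7 to 8 step (which is fixed at 1)."""
    grades = sorted(time_scores)
    return {
        f"{earlier}->{later}": round(time_scores[later] - time_scores[earlier], 2)
        for earlier, later in zip(grades, grades[1:])
    }


achievement_steps = growth_steps(achievement_time_scores)
attitude_steps = growth_steps(attitude_time_scores)

print("achievement:", achievement_steps)
print("attitude:   ", attitude_steps)
```

The achievement steps after Grade 8 are larger than the Grade 7 to 8 step, while the attitude steps are equal by construction.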
Table 4 

Results From Two-Level Random Coefficient Growth Model (n = 1,869) 

Chi-square (39) = 68.38 

                                              Parameter 
                                              estimates    t-values 

Within 

  Cross-lags (timepoints) 
    Achievement -> Attitude 
      Grade 7 -> Grade 8                        -0.001       -0.05 
      Grade 8 -> Grade 9                        -0.01        -0.79 
      Grade 9 -> Grade 10                       -0.01        -0.45 
    Attitude -> Achievement 
      Grade 7 -> Grade 8                         0.04         0.54 
      Grade 8 -> Grade 9                        -0.15        -1.74 
      Grade 9 -> Grade 10                       -0.15        -0.85 

  Growth Model 
    Achievement Initial Status 
      -> Attitude Growth Rate                    0.003        0.39 
    Attitude Initial Status 
      -> Achievement Growth Rate                 0.23         1.29 

  Effects of SES on 
    Achievement 
      Initial Status                             2.93        10.21 
      Growth Rate                                0.16         1.84 
    Attitude 
      Initial Status                             0.29         3.54 
      Growth Rate                               -0.08        -2.31 

  Factor Residual (Co)Variances 
    Achievement 
      Initial Status                            57.84        14.50 
      Growth Rate                                1.16         2.17 
      Initial Status, Growth Rate                1.57         1.16 
    Attitude 
      Initial Status                             4.24         1.33 
      Growth Rate                                0.71         0.67 
      Initial Status, Growth Rate               -0.80        -0.50 
    Achievement, Attitude 
      Initial Status                             4.38         6.71 
      Growth Rate                                0.28         1.06 

  Initial Status Intercept 
    Achievement                                 52.47       117.38 
    Attitude                                    11.36       117.85 
Table 4 (continued) 

Chi-square (39) = 68.38 

                                              Parameter 
                                              estimates    t-values 

  Growth Curve 
    Achievement 
      7th Grade                                  0* 
      8th Grade                                  1* 
      9th Grade                                  2.60        12.81 
      10th Grade                                 3.85        11.86 
    Attitude 
      7th Grade                                  0* 
      8th Grade                                  1* 
      9th Grade                                  2* 
      10th Grade                                 3* 

  Growth Rate Intercept 
    Achievement                                  2.37         9.82 
    Attitude                                    -0.32        -9.10 

Between 

  Effects of SES on 
    Achievement 
      Initial Status                             7.96         5.24 
      Growth Rate                                0.91         3.31 
    Attitude 
      Initial Status                             0.31         0.91 
      Growth Rate                                0.12         0.93 

  Factor Residual (Co)Variances 
    Achievement 
      Initial Status                             6.11         3.38 
      Growth Rate                                0.08         1.26 
      Initial Status, Growth Rate                0.15         0.64 
    Attitude 
      Initial Status                             0.19         1.18 
      Growth Rate                                0.02         0.66 
      Initial Status, Growth Rate               -0.03        -0.48 
    Achievement, Attitude 
      Initial Status                             0.65         2.09 
      Growth Rate                               -0.02        -1.20 
      Initial Status, Growth Rate               -0.06        -0.58 
      Growth Rate, Initial Status               -0.08        -1.50 

* Parameter is fixed in this model. 
In the school-level part of the model, the correlation between achievement 
and attitude intercepts (not shown in the table) obtains a rather high value, 0.61 
(the student-level value is 0.27). On the school level it is seen that SES does not 
have a significant influence on the attitude intercept or slope factors. The 
influence on the achievement intercept and slope is, however, significantly 
positive. This reflects across-school heterogeneity in neighborhood resources so 
that schools with higher SES families have both higher initial achievement and 
stronger growth over grades. It is interesting to note that significant student- 
level influence of SES on the student-level achievement growth rate was not 
seen, while strongly significant school-level influence of SES is seen on the 
school-level achievement growth rate. 
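
The contrast between the student-level and school-level SES effects can be summarized from the t-values in Table 4. The sketch below flags each effect using a cutoff of 1.96, the conventional two-sided 5% value under a normal approximation; this cutoff is an assumption made here, and because the standard errors come from the ad hoc estimator the flags should be read as approximate.

```python
# Assumed cutoff: two-sided 5% level under a normal approximation.
CRITICAL_VALUE = 1.96

# t-values for the effects of SES, taken from Table 4.
ses_effect_t_values = {
    ("student", "achievement initial status"): 10.21,
    ("student", "achievement growth rate"): 1.84,
    ("student", "attitude initial status"): 3.54,
    ("student", "attitude growth rate"): -2.31,
    ("school", "achievement initial status"): 5.24,
    ("school", "achievement growth rate"): 3.31,
    ("school", "attitude initial status"): 0.91,
    ("school", "attitude growth rate"): 0.93,
}


def is_significant(t_value, critical_value=CRITICAL_VALUE):
    """True when the absolute t-value exceeds the chosen cutoff."""
    return abs(t_value) > critical_value


flags = {effect: is_significant(t) for effect, t in ses_effect_t_values.items()}

for (level, factor), flagged in flags.items():
    label = "significant" if flagged else "not significant"
    print(f"{level:8s} SES -> {factor:28s} {label}")
```

With this cutoff the achievement growth rate effect is flagged at the school level but not at the student level, and neither school-level attitude effect is flagged.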